Effective error recovery strategies for multimodal form-filling applications

Authors

  • Janienke Sturm
  • Lou Boves
Abstract

The goal of the research described in this article is to determine how speech recognition errors can best be handled in a multimodal form-filling interface. Besides two well-known error correction mechanisms (re-speaking the value and choosing the correct value from a list of alternatives), the interface offers a novel correction mechanism in which the user selects the first letter of the target word from a soft keyboard, after which the utterance is recognized once again with a limited language model and lexicon. The multimodal interface that was used is a web-based form-filling GUI, extended with a speech overlay, which allows for pen and speech input. The effectiveness and efficiency of the error correction mechanisms, the error correction strategies applied by the users, and the effects on user satisfaction were studied in an evaluation in which the interface was tested in two conditions. In one condition (LIST), the interface provides only re-speaking and the alternatives list as error correction facilities. In the other condition (LETTER), the interface provides the soft-keyboard technique as an additional error correction facility. The study shows that error correction was more effective in the LETTER condition than in the LIST condition. The Keyboard correction facility enables users to solve errors that could not be solved using the Re-speak method or by choosing from a list of alternatives. In spite of its low effectiveness, subjects initially attempted to use Re-speaking for error correction in both interfaces. However, we also found that subjects rapidly learned to choose the most effective option (Keyboard) immediately as they gained experience. User satisfaction turned out to be higher for the LETTER interface than for the LIST interface: subjects considered the LETTER interface to be more useful and less frustrating, and they felt more in control. As a result, most subjects clearly preferred the LETTER interface.
© 2005 Elsevier B.V. All rights reserved.
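
The letter-based correction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a hypothetical recognizer that has already assigned a score to every lexicon word for the stored utterance, and it approximates "recognizing once again with a limited lexicon" by keeping only the words that start with the selected letter and choosing the best-scoring one. The function name, the score values, and the example words are all invented for illustration.

```python
from typing import Dict, Optional


def rerecognize_with_letter(scores: Dict[str, float], first_letter: str) -> Optional[str]:
    """Return the best-scoring lexicon word that starts with the chosen letter.

    `scores` is a hypothetical stand-in for the recognizer: it maps each
    lexicon word to a score for the stored utterance (higher is better).
    """
    letter = first_letter.casefold()
    # Restrict the lexicon to words beginning with the selected letter.
    restricted = {w: s for w, s in scores.items() if w.casefold().startswith(letter)}
    if not restricted:
        return None  # no lexicon entry matches the selected letter
    return max(restricted, key=restricted.get)


# Hypothetical scores for one spoken city name (assumed values).
scores = {"Amsterdam": 0.41, "Rotterdam": 0.38, "Zaandam": 0.12, "Arnhem": 0.09}
best_unrestricted = max(scores, key=scores.get)         # top hypothesis without a hint
best_with_letter = rerecognize_with_letter(scores, "R")  # top hypothesis after the user taps "R"
```

In this toy setup the unrestricted top hypothesis is a misrecognition, and a single tap on the first letter removes it from the candidate set. A real system would presumably decode the audio again against the restricted lexicon rather than reuse fixed scores.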

Similar resources

Modeling different decision strategies in a time tabled multimodal route planning by integrating the quantifier-guided OWA operators, fuzzy AHP weighting method and TOPSIS

The purpose of Multi-modal Multi-criteria Personalized Route Planning (MMPRP) is to provide an optimal route between an origin-destination pair by considering the weights of effective criteria, such that the route can combine public and private modes of transportation. In this paper, the fuzzy analytical hierarchy process (fuzzy AHP) and the quantifier-guided ordered weighted averaging (...

Getting rid of "OK Google": Individual Multimodal Input Adaption in Real World Applications

Multimodal interaction has the potential to significantly increase the ease of use of human-computer interaction (HCI). At the same time, due to error-prone recognition-based inputs, it is rarely used in real-world applications. While literature on multimodal input fusion describes modeling of different user behaviors as a key for increased robustness, it still failed to prove its practical us...

How Finite State Machines Can Be Used to Build Error Free Multimodal Interaction Systems

Recognition-based interaction technologies (e.g. speech and gesture recognition) are still error-prone. It has been shown that, in multimodal architectures, combining complementary input modes can contribute to automatic recovery from recognition errors. However, the degree to which error recovery can be achieved is dependent on the design of the interaction, i.e. on the set of multimodal const...

Achieving Multimodal Cohesion during Intercultural Conversations

How do English as a lingua franca (ELF) speakers achieve multimodal cohesion on the basis of their specific interests and cultural backgrounds? From a dialogic and collaborative view of communication, this study focuses on how verbal and nonverbal modes cohere together during intercultural conversations. The data include approximately 160-minute transcribed video recordings of ELF interactions ...

Journal title:
  • Speech Communication

Volume 45  Issue 

Pages  -

Publication date 2005